fix(helm): supervisor OTLP endpoint resolves cross-namespace #3504

Open
nicktrn wants to merge 1 commit into main from fix/helm-supervisor-otel-fqdn

@nicktrn
Collaborator

@nicktrn nicktrn commented May 1, 2026

Reported by external contributor. The supervisor template hardcoded a short DNS name for OTEL_EXPORTER_OTLP_ENDPOINT, which the supervisor then propagates verbatim into runner pods (apps/supervisor/src/workloadManager/kubernetes.ts:196). When runners are spawned in a different namespace via supervisor.config.kubernetes.namespace, the short name doesn't resolve and span/log export silently fails - runs complete fine but the dashboard shows nothing.
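
A rough sketch of why the short name fails from another namespace, assuming the default cluster domain `cluster.local`, the default `ndots:5` resolver option, and glibc-style search-list ordering. The namespace `runner-ns` is hypothetical; `trigger-webapp` and `my-ns` are taken from the rendered example in this description. This is a simplified model of resolver behaviour, not code from the supervisor.

```typescript
// Search domains a pod typically gets in resolv.conf,
// assuming the default cluster domain "cluster.local".
function searchDomains(podNamespace: string): string[] {
  return [
    `${podNamespace}.svc.cluster.local`,
    "svc.cluster.local",
    "cluster.local",
  ];
}

// Names a glibc-style resolver would try, in order, for `host`
// when queried from a pod in `podNamespace`.
function candidates(host: string, podNamespace: string, ndots = 5): string[] {
  const expanded = searchDomains(podNamespace).map((d) => `${host}.${d}`);
  const dots = host.split(".").length - 1;
  // Fewer dots than ndots: search list first, then the name as given.
  return dots >= ndots ? [host, ...expanded] : [...expanded, host];
}

// Where the webapp Service actually lives (release namespace "my-ns").
const target = "trigger-webapp.my-ns.svc.cluster.local";

// Short name from a runner pod in a different (hypothetical) namespace.
const shortFromRunner = candidates("trigger-webapp", "runner-ns");
// Fully-qualified name from that same runner pod.
const fqdnFromRunner = candidates(target, "runner-ns");
// Short name from a pod in the release namespace.
const shortFromRelease = candidates("trigger-webapp", "my-ns");

console.log(shortFromRunner.includes(target));
console.log(fqdnFromRunner.includes(target));
console.log(shortFromRelease.includes(target));
```

In this model the short name only ever expands to the caller's own namespace, so it reaches the Service from the release namespace but not from `runner-ns`, while the FQDN is tried as given from either.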

Same FQDN pattern the chart already uses for TRIGGER_WORKLOAD_API_DOMAIN (line 203). Verified with helm template trigger . --namespace my-ns - renders http://trigger-webapp.my-ns.svc.cluster.local:3030/otel.

Cheers Niels

@changeset-bot

changeset-bot Bot commented May 1, 2026

⚠️ No Changeset found

Latest commit: 7269d4f

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@coderabbitai
Contributor

coderabbitai Bot commented May 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7f93e41e-09a0-40cf-99a6-917e0b399edf

📥 Commits

Reviewing files that changed from the base of the PR and between b19cf6d and 7269d4f.

📒 Files selected for processing (1)
  • hosting/k8s/helm/templates/supervisor.yaml
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: audit
  • GitHub Check: Analyze (actions)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-27T22:45:00.623Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3114
File: apps/supervisor/src/index.ts:252-281
Timestamp: 2026-03-27T22:45:00.623Z
Learning: In `apps/supervisor/src/index.ts`, compute supervisors (COMPUTE_GATEWAY_URL set) and K8s/Docker supervisors are always separate deployments. A compute supervisor will never receive K8s/Docker checkpoint messages, so there is no routing mismatch between the compute restore path and the legacy `checkpointClient.restoreRun()` path. Do not flag this as an architectural concern.

Applied to files:

  • hosting/k8s/helm/templates/supervisor.yaml
🔇 Additional comments (1)
hosting/k8s/helm/templates/supervisor.yaml (1)

239-240: ⚡ Quick win

OTEL endpoint FQDN looks correct for cross-namespace DNS; please verify webapp namespace source.

This change updates OTEL_EXPORTER_OTLP_ENDPOINT to a cluster-local FQDN (<service>.<namespace>.svc.cluster.local), which should resolve from runner pods created in a different namespace and restore telemetry export.

One verification item: confirm the webapp Service ({{ include "trigger-v4.fullname" . }}-webapp) is always created in {{ .Release.Namespace }}. If the chart supports deploying the webapp into a different namespace via values, this template should use that namespace value rather than .Release.Namespace.


Walkthrough

The supervisor container's OTEL_EXPORTER_OTLP_ENDPOINT environment variable in the Helm template was updated from a short Kubernetes service hostname to a fully-qualified DNS name. The endpoint format changed from <fullname>-webapp:<port>/otel to <fullname>-webapp.<namespace>.svc.cluster.local:<port>/otel to use the standard Kubernetes cluster-local DNS resolution pattern for cross-namespace service discovery.
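
To make the format change concrete, here is a minimal sketch of the two endpoint shapes as plain string builders. The function names are hypothetical and this is not the chart's template code; the `http://` scheme and the values `trigger`, `my-ns`, and `3030` are taken from the rendered output quoted in the PR description, and `cluster.local` is assumed as the cluster domain.

```typescript
// Old shape: short Service name, resolvable only from the Service's own namespace.
function shortOtlpEndpoint(fullname: string, port: number): string {
  return `http://${fullname}-webapp:${port}/otel`;
}

// New shape: cluster-local FQDN, <service>.<namespace>.svc.<cluster domain>.
function fqdnOtlpEndpoint(
  fullname: string,
  namespace: string,
  port: number,
  clusterDomain = "cluster.local", // assumption: default cluster domain
): string {
  return `http://${fullname}-webapp.${namespace}.svc.${clusterDomain}:${port}/otel`;
}

const before = shortOtlpEndpoint("trigger", 3030);
const after = fqdnOtlpEndpoint("trigger", "my-ns", 3030);

console.log(before);
console.log(after);
```

Because the supervisor passes this value unchanged to runner pods, the namespace has to be part of the string itself; the runner's own namespace plays no role in resolving the second form.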

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main change: fixing the Helm supervisor template's OTLP endpoint to resolve across Kubernetes namespaces using a fully-qualified DNS name. |
| Description check | ✅ Passed | The description provides context about the issue, explains the root cause, references the specific code location, includes reproduction steps, and notes the fix pattern already used elsewhere in the chart. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions
Contributor

github-actions Bot commented May 1, 2026

🧭 Helm Chart Prerelease Published

Version: 4.4.5-pr3504.7269d4f

Install:

helm upgrade --install trigger \
  oci://ghcr.io/triggerdotdev/charts/trigger \
  --version "4.4.5-pr3504.7269d4f"

⚠️ This is a prerelease for testing. Do not use in production.

@nicktrn nicktrn added the ready label May 1, 2026
@nicktrn
Collaborator Author

nicktrn commented May 1, 2026

ready
